KIPS Transactions: Part D (정보처리학회 논문지 D)

Korean Title: 한글 형태소 및 키워드 분석에 기반한 웹 문서 분류
English Title: Web Document Classification Based on Hangeul Morpheme and Keyword Analyses
Authors: 박단호 (Dan-Ho Park), 최원식 (Won-Sik Choi), 김홍조 (Hong-Jo Kim), 이석룡 (Seok-Lyong Lee)
Citation: Vol. 19-D, No. 4, pp. 263-270 (August 2012)
Korean Abstract
With recent advances in high-speed Internet and large-capacity database technology, the volume of web documents has grown substantially, and automatic classification of documents by subject has emerged as an important problem for managing them effectively. This study proposes a document feature extraction method based on Hangeul morpheme and keyword analyses, and uses it to predict the subject of unstructured documents such as web documents and classify them automatically. First, to extract document features, terms are selected with a Hangeul morpheme analyzer; a keyword set of subject-discriminating terms is then built from each term's frequency and subject-discriminating power; and each keyword is scored according to its discriminating power. Next, based on the extracted features, three classification models (decision tree, neural network, and SVM) were generated using commercial software. In the experiments, document classification with the proposed feature extraction method showed considerable performance: an average precision of 0.90 and recall of 0.84 for the decision tree model.
English Abstract
With the development of high-speed Internet and massive database technology, the number of web documents is increasing rapidly, so classifying them automatically is becoming important. In this study, we propose an effective method for extracting document features based on Hangeul morpheme and keyword analyses, and for classifying unstructured documents automatically by predicting their subjects. To extract document features, we first select terms using a morpheme analyzer, form a keyword set based on term frequency and subject-discriminating power, and score each keyword by its discriminating power. We then generate classification models using commercial software that implements the decision tree, neural network, and SVM (support vector machine). Experimental results show that the proposed feature extraction method achieves considerable performance in classifying web documents by subject: an average precision of 0.90 and recall of 0.84 for the decision tree.
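The feature extraction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption: the function names `build_keywords` and `extract_features`, the thresholds `min_freq` and `min_power`, and the definition of discriminating power as the share of a term's occurrences that fall in its dominant subject. The romanized tokens stand in for the output of a Hangeul morpheme analyzer, which is not implemented here.

```python
from collections import Counter, defaultdict


def build_keywords(docs, min_freq=2, min_power=0.7):
    """Select subject-discriminating keywords from labeled, tokenized documents.

    docs: list of (subject, tokens) pairs; tokens are assumed to come from
    a morpheme analyzer. Returns {term: (subject, score)}.
    """
    freq = defaultdict(Counter)  # term -> {subject: occurrence count}
    for subject, tokens in docs:
        for term in tokens:
            freq[term][subject] += 1

    keywords = {}
    for term, by_subject in freq.items():
        total = sum(by_subject.values())
        subject, top = by_subject.most_common(1)[0]
        # Assumed definition: fraction of occurrences in the dominant subject.
        power = top / total
        if total >= min_freq and power >= min_power:
            keywords[term] = (subject, power)
    return keywords


def extract_features(tokens, keywords, subjects):
    """Sum keyword scores per subject to form a document feature vector."""
    scores = dict.fromkeys(subjects, 0.0)
    for term in tokens:
        if term in keywords:
            subject, score = keywords[term]
            scores[subject] += score
    return [scores[s] for s in subjects]


# Hypothetical toy corpus (romanized Korean nouns).
train = [
    ("sports", ["chukgu", "gyeonggi", "seonsu", "gyeonggi"]),
    ("sports", ["yagu", "seonsu", "sijang"]),
    ("economy", ["jusik", "sijang", "geumri"]),
    ("economy", ["geumri", "sijang", "eunhaeng"]),
]
subjects = ["sports", "economy"]

keywords = build_keywords(train)
features = extract_features(
    ["seonsu", "gyeonggi", "geumri", "nalssi"], keywords, subjects
)
print(features)  # [2.0, 1.0]
```

In this sketch, "sijang" appears under both subjects and falls below the assumed power threshold, so it is not selected as a keyword. The resulting per-subject score vector is the kind of input that a decision tree, neural network, or SVM could then be trained on.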
Keywords: Document Feature Extraction, Document Classification, Morpheme Analysis, Keyword Frequency Analysis